Fix HistGradientBoostingClassifier early stopping with string labels (swev-id: scikit-learn__scikit-learn-14710)} #46

Open
casey-brooks wants to merge 1 commit into scikit-learn__scikit-learn-14710 from fix-hgbc-early-stopping-strings-14710

casey-brooks commented Dec 26, 2025

scikit-learn#14710

Summary

  • ensure early stopping scorer uses decoded class labels before invoking scorers
  • add binary and multiclass regression tests covering string label training and validation paths
  • retain existing training/validation score bookkeeping
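The first bullet can be illustrated with a minimal sketch. It assumes that the early-stopping code holds integer-encoded targets (possibly stored as floats) while `predict` returns the original string labels, so the targets must be mapped back through the fitted classes before a scorer compares them. The helper name `decode_targets` is hypothetical and is not part of scikit-learn or necessarily of this PR.

```python
import numpy as np


def decode_targets(y_encoded, classes):
    """Map integer-encoded targets back to the original class labels.

    Hypothetical helper for illustration only; not scikit-learn API.
    """
    # Encoded targets may be stored as floats, so cast before indexing.
    return np.asarray(classes)[np.asarray(y_encoded).astype(int)]


# Assumed state after fitting on string labels.
classes = np.array(["class_a", "class_b"])       # analogous to `classes_`
y_val_encoded = np.array([0.0, 1.0, 1.0, 0.0])   # encoded validation targets
y_pred = np.array(["class_a", "class_b", "class_a", "class_a"])  # string predictions

# Decode first, then compare like with like.
y_val = decode_targets(y_val_encoded, classes)
accuracy = float(np.mean(y_val == y_pred))
```

Decoding the targets, rather than encoding the predictions, keeps the scorer's inputs in the label space the user supplied, which is what a user-facing metric such as accuracy expects.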

Testing

  • pytest
  • flake8 sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py

Reproduction

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = make_classification(random_state=0)
y = np.where(y == 0, 'class_a', 'class_b')

HistGradientBoostingClassifier(
    max_iter=30,
    n_iter_no_change=5,
    scoring='accuracy',
    validation_fraction=0.2,
    random_state=0,
).fit(X, y)

Traceback (most recent call last):
  ...
  File "sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 437, in _maybe_do_early_stopping
    self.scorer_(self, X_binned_val, y_val)
  File "sklearn/metrics/_scorer.py", line 88, in __call__
    return self._score(
  File "sklearn/metrics/_scorer.py", line 252, in _score
    return scorer(y_true, y_pred, sample_weight=sample_weight)
  File "sklearn/metrics/_classification.py", line 1871, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "sklearn/metrics/_classification.py", line 101, in _check_targets
    raise TypeError("< not supported between instances of str and float")
TypeError: '<' not supported between instances of 'str' and 'float'
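The final `TypeError` is consistent with string labels being ordered against float-encoded targets somewhere inside the metric. The sketch below is independent of scikit-learn and only shows that ordering a mix of `str` and `float` values raises this kind of error in Python; the function name `try_sort_mixed` is hypothetical, and the exact operand order in the message may differ from the traceback above.

```python
def try_sort_mixed(values):
    """Return the TypeError message raised when sorting `values`, or None."""
    try:
        sorted(values)
    except TypeError as exc:
        return str(exc)
    return None


# A string label next to a float-encoded target cannot be ordered.
message = try_sort_mixed(["class_a", 0.0])
print(message)
```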

Fixes #47

casey-brooks requested a review from a team on December 26, 2025 03:50
casey-brooks (Author) commented

Test & Lint Summary

  • OMP_NUM_THREADS=1 pytest
    • 12964 passed, 198 skipped, 6 xfailed
  • flake8 sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py
    • no issues detected

noa-lucent left a comment

Looks good to me.

rowan-stein changed the title from "Fix HistGradientBoostingClassifier early stopping string labels" to "Fix HistGradientBoostingClassifier early stopping with string labels (swev-id: scikit-learn__scikit-learn-14710)}" on Dec 26, 2025